Abstract

In this paper, we show that genotype-phenotype mapping can be theoretically interpreted
using the mathematical concept of quotient space. A quotient space can be regarded as a
mathematically defined phenotype space in evolutionary computation theory. The quotient
geometric crossover reduces the search space actually searched by geometric crossover, and
it introduces problem knowledge into the search by using a distance better tailored to the
specific solution interpretation. Quotient geometric crossovers are applied directly to the
genotype space, but they have the effect of crossovers performed on the phenotype space.
We give many example applications of the quotient geometric crossover.
Keywords: Geometric crossover, genotype-phenotype mapping, quotient metric space, quotient
geometric crossover.

1 Introduction 

In evolutionary computation, a genotype is a solution representation, that is, a structure
that can be stored in a computer and manipulated. A phenotype is the solution itself, without
any reference to how it is represented. Sometimes it is possible to have a one-to-one mapping
between genotypes and phenotypes, so the distinction between genotype and phenotype becomes
purely formal. However, in many interesting cases, phenotypes cannot be represented uniquely
by genotypes, so the same phenotype is represented by more than one genotype. In such a case
we say that we have a redundant representation. For example, to represent a graph we need to
label its nodes, and then we can represent it using its adjacency matrix. This representation is
redundant because the same graph can be represented by more than one adjacency matrix by
relabeling its nodes.
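The redundancy of the adjacency-matrix representation can be made concrete with a small sketch. The path graph on three nodes and the helper name `relabel` are illustrative assumptions, not taken from the paper; the code simply enumerates every relabeling of one graph and collects the distinct adjacency matrices that result.

```python
from itertools import permutations

def relabel(adj, perm):
    """Adjacency matrix after renaming node i to perm[i]."""
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[perm[i]][perm[j]] = adj[i][j]
    return out

# Hypothetical example: the path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

# Collect every adjacency matrix reachable by relabeling the three nodes.
genotypes = {tuple(map(tuple, relabel(path, p))) for p in permutations(range(3))}
print(len(genotypes))  # number of genotypes that encode this one phenotype
```

Every matrix in `genotypes` describes the same unlabeled path, so a search operating on matrices visits one phenotype several times.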

There are quite a few problems in which it is hard to represent one phenotype by just one
genotype using traditional representations. Roughly speaking, redundant representation leads 
to severe loss of search power in genetic algorithms, in particular, with respect to traditional 
crossovers [3]. To alleviate the problems caused by redundant representation, a number of 
methods such as adaptive crossover have been proposed [BJ [TBI [2"4l 131] . Among them, a tech- 
nique called normalization is representative. It transforms the genotype of a parent to another 
genotype to be consistent with the other parent so that the genotype contexts of the parents 
are as similar as possible in crossover. There have been a number of successful studies using 
normalization. An extensive survey of normalization appears in [3].

We recognized that genotype-phenotype mapping can be theoretically interpreted using the 
concept of quotient space in mathematics. In this paper, we formally present the general relation 
between the notion of quotient and genotype-phenotype mapping, and we study the relation 
between genotype and phenotype spaces and geometric crossovers on them. 

For analysis, we adopted the concept of geometric crossover [20] because it is representation- 
independent and well-defined once a notion of distance in the search space is defined. In this 
study, we consider only genotype and phenotype spaces that are metric spaces. So the metric 
for a space is considered the most important characteristic of its structure. This approach
enables us to deal with the problem spaces more mathematically.

The remainder of the paper is organized as follows. In Section 2 we present some necessary
mathematical notions and the geometric framework. In Section 3 the new notion of quotient
geometric crossover is introduced in connection with genotype-phenotype mapping. In Section 4
we study several useful examples. In Sections 4.1 and 4.2 we show how previous work on
groupings [H] and graphs can be recast and understood more simply in terms of quotient
geometric crossover. In such problems, quotient geometric crossover has the effect of filtering
out inherent redundancy in the solution representation. In Section 4.3 we consider symmetric
functions and their problem spaces. The usage of the quotient geometric crossover for circular
permutation encodings is discussed in Section 4.4. In Section 4.5 we show how homologous
crossover for variable-length sequences [23] can be understood as a quotient geometric crossover.
Finally, we give our conclusions in the last section.

2 Preliminaries 

2.1 Mathematical Notions 

In the following, we give some known mathematical definitions required to present our idea. 

Given a set $X$ and an equivalence relation $\sim$ on $X$, the equivalence class of an element
$a$ in $X$ is the subset of all elements in $X$ that are equivalent to $a$:

$\bar{a} = \{x \in X : a \sim x\}$.

1 The term normalization first appeared in [TTJ. However, it is based on the adaptive
crossovers proposed in pIlEI].
The set of all equivalence classes in $X$ given an equivalence relation $\sim$ is usually denoted
by $X/\sim$ and called the quotient set of $X$ by $\sim$. This operation can be thought of (very
informally) as the act of "dividing" the input set by the equivalence relation. The quotient set
can be regarded as the set obtained by identifying all mutually equivalent points as a single point.

Next, we introduce the notion of a group [8]. A group is an algebraic structure consisting of a
set together with an operation that combines any two of its elements to form a third element.
To qualify as a group, the set and operation must satisfy a few conditions called group axioms,
namely associativity, identity, and invertibility. Formally, it is defined as follows:

Definition 1 (Group). A group $(\mathcal{G}, *)$ is a set $\mathcal{G}$ closed under a binary
operation $*$, such that the following axioms are satisfied:

(i) Associativity: for all $a, b, c \in \mathcal{G}$, we have

$(a * b) * c = a * (b * c)$.

(ii) Identity: there is an element $e$ in $\mathcal{G}$ such that for all $x \in \mathcal{G}$,

$e * x = x * e = x$.

(iii) Invertibility: for each $a \in \mathcal{G}$, there is an element $a^{-1}$ in $\mathcal{G}$ such that

$a * a^{-1} = a^{-1} * a = e$.

In this paper, we will use groups for constructing equivalence relations with good properties. 

In the following, we view the problem spaces as metric spaces. It is a reasonable assumption 
because the general solution spaces usually have metrics. For example, binary space has the 
Hamming distance and real space has Minkowski distances including the Euclidean distance. 
Formally, the term metric - or distance - denotes any real-valued function that conforms to the 
axioms of identity, symmetry, and triangular inequality. Now, we introduce an isometry on a 
metric space. 

Definition 2 (Isometry). Let $(X, d)$ be a metric space. If $f: X \to X$ satisfies the condition

$d(f(x), f(y)) = d(x, y)$

for all $x, y \in X$, then $f$ is called an isometry of $X$.

The set of isometries on $X$ is denoted by $\mathrm{Iso}(X)$. $\mathrm{Iso}(X)$ forms a group under
the function composition operator. In our study, an isometry subgroup
$\mathcal{G} \subseteq \mathrm{Iso}(X)$ will be considered to generate an equivalence relation for
the quotient metric space.

2.2 Geometric Preliminaries 

In this subsection we provide some geometric definitions, which extend those introduced in 
[181 [TU] . The following definitions are taken from [3] . 

In a metric space $(X, d)$, a line segment (or closed interval) is a set of the form
$[x; y]_d = \{z \in X \mid d(x, z) + d(z, y) = d(x, y)\}$, where $x, y \in X$ are called the extremes
of the segment. The metric segment generalizes the familiar notion of a segment in Euclidean
space to any metric space through distance redefinition. Notice that a metric segment does not
necessarily coincide with a shortest path connecting its extremes (a geodesic), as it does in a
Euclidean space. In general, there may be more than one geodesic connecting two extremes; the
metric segment is the union of all geodesics.
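As an illustration of the segment definition, the following sketch enumerates a metric segment by brute force. The choice of binary strings of length 3 under the Hamming distance, and the function names, are assumptions made for the example only.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which a and b differ."""
    return sum(u != v for u, v in zip(a, b))

def segment(x, y, space, d):
    """All z in space with d(x, z) + d(z, y) == d(x, y)."""
    return [z for z in space if d(x, z) + d(z, y) == d(x, y)]

space = list(product((0, 1), repeat=3))
seg = segment((0, 0, 0), (1, 1, 0), space, hamming)
print(seg)
```

In this example the segment contains exactly the strings that agree with both extremes wherever the extremes agree, which is why it is the union of the two shortest paths between them.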

We assign a structure to the solution set $X$ by endowing it with a notion of distance $d$.
$M = (X, d)$ is therefore a solution space and $(M, f)$ is the corresponding fitness landscape,
where $f$ is the fitness function over $X$.

2.3 Geometric Crossover 

Geometric crossover is a representation-independent search operator that generalizes many pre- 
existing search operators for the major representations used in evolutionary algorithms, such as 
binary strings [IB], real vectors [13 [33], permutations [20], permutations with repetition [17],
syntactic trees [19], sequences [23], and sets [21]. It is defined in geometric terms using the
notions of line segment and ball. These notions and the corresponding genetic operators are 
well-defined once a notion of distance in the search space is defined. Defining search operators 
as functions of the search space is opposite to the standard way [10] in which the search space 
is seen as a function of the search operators employed. This viewpoint greatly simplifies the 
relationship between search operators and fitness landscape and has allowed us to give simple 
rules of thumb to build crossover operators that are likely to perform well.

The following definitions are representation-independent and therefore applicable to any
representation.

Definition 3 (Image set). The image set $\mathrm{Im}[OP]$ of a genetic operator $OP$ is the set of
all possible offspring produced by $OP$.

Definition 4 (Geometric crossover). A binary operator $GX$ is a geometric crossover under the
metric $d$ if all offspring are in the segment between its parents $x$ and $y$, i.e.,

$\mathrm{Im}[GX(x, y)] \subseteq [x; y]_d$.

A number of general properties for geometric crossover have been derived in [18] where it was 
also shown that traditional mask-based crossovers are geometric under the Hamming distance. 
Moraglio and Poli also studied various crossovers for permutations, revealing that PMX (par- 
tially matched crossover) [9], a well-known crossover for permutations, is geometric under swap 
distance. Also, they found that cycle crossover [25], another traditional crossover for permuta- 
tions, is geometric under swap distance and under the Hamming distance. 
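The statement that mask-based crossovers are geometric under the Hamming distance can be checked empirically with a short sketch. The parents, the random seed, and the function names below are illustrative assumptions; the check samples offspring of uniform crossover and tests the segment condition of Definition 4 on each of them.

```python
import random

def hamming(a, b):
    """Number of positions at which a and b differ."""
    return sum(u != v for u, v in zip(a, b))

def uniform_crossover(x, y, rng):
    """Mask-based crossover: each position is copied from one of the parents."""
    return tuple(rng.choice(pair) for pair in zip(x, y))

def in_segment(z, x, y, d):
    """Segment membership test: d(x, z) + d(z, y) == d(x, y)."""
    return d(x, z) + d(z, y) == d(x, y)

rng = random.Random(0)
x = (0, 1, 1, 0, 1, 0, 0, 1)
y = (1, 1, 0, 0, 1, 1, 0, 0)
offspring = [uniform_crossover(x, y, rng) for _ in range(200)]
all_in = all(in_segment(z, x, y, hamming) for z in offspring)
print(all_in)
```

A sampled check of this kind is only a sanity test; the general result is the one derived in [18].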

Theoretical results of metric spaces can naturally lead to interesting results for geometric 
crossover. In particular, Moraglio and Poli showed that the notion of metric transformation 
has great potential for geometric crossover in [22]. A metric transformation is an operator 
that constructs new metric spaces from pre-existing metric spaces: it takes one or more metric 
spaces as input and outputs a new metric space. The notion of metric transformation becomes 
extremely interesting when considered together with distances firmly rooted in the syntactic 
structure of the underlying solution representation (e.g., edit distance). In these cases it gives 
rise to a simple and natural interpretation in terms of syntactic transformations. 
Moraglio and Poli extended the geometric framework by introducing the notion of product
crossover associated with the Cartesian product of metric spaces in [22]. This is a very important
tool that allows one to build new geometric crossovers customized to problems with mixed
representations by combining pre-existing geometric crossovers in a straightforward way. Using
the product geometric crossover, they also showed that traditional crossovers for symbolic vectors
and blend crossovers for integer and real vectors are geometric crossovers.

3 Quotient Geometric Crossover 
3.1 Motivation 

Geometric operators are defined as functions of the distance associated to the search space. 
However, the search space does not come with the problem itself. The problem consists only of 
a fitness function to optimize, that defines what a solution is and how to evaluate it, but it does 
not give any structure on the solution set. The act of putting a structure over the solution set 
is a part of the search algorithm design and it is a designer's choice. 

A fitness landscape is the fitness function plus a structure over the solution space. So, for 
each problem, there is one fitness function but as many fitness landscapes as the number of 
possible different structures over the solution set. In principle, the designer can choose the 
structure to assign to the solution set completely independently from the problem at hand. 
However, because the search operators are defined over such a structure, doing so would make 
them decoupled from the problem at hand, hence turning the search into something very close 
to random search. 

To avoid this problem, one can exploit problem knowledge in the search. This can be achieved
by carefully designing the connectivity structure of the fitness landscape. For example, one can
study the objective function of the problem and select a neighborhood structure that couples the
distance between solutions with their fitness values. Once this is done, problem knowledge can be
exploited by search operators to perform better than random search, even if the search operators
are problem-independent (as is the case for geometric operators). Indeed, the fitness landscape is
a knowledge interface between the problem at hand and a formal, problem-independent search
algorithm.

Under which conditions is a landscape well-searchable by geometric operators? As a rule
of thumb, geometric crossover works well on landscapes where the closer a pair of solutions is,
the more correlated their fitness values are. Of course this is no surprise: the importance of
landscape smoothness has been advocated in many different contexts and has been confirmed in
countless empirical studies with many neighborhood search meta-heuristics [27]. We operate
according to the following rules of thumb:

Rule of thumb 1: if we have a good distance for the problem at hand, then we have a good
geometric crossover.

Rule of thumb 2: a good distance for the problem at hand is a distance that makes the landscape
"smooth."

Figure 1: Diagram linking genotype, phenotype spaces, and geometric crossovers

3.2 Genotype-Phenotype Mapping 

We formally present the general relation between the notion of quotient and genotype-phenotype
mapping. Let $G$ and $P$ be the genotype space and the phenotype space, respectively. Consider a
genotype-phenotype mapping $q: G \to P$ that is not injective (i.e., a redundant representation).
The mapping $q$ induces a natural equivalence relation $\sim$ on the set of genotypes: genotypes
with the same phenotype belong to the same class. Then the phenotype space $P$ becomes exactly a
quotient space $G/\sim$ of the genotype space $G$.

The advantage of geometric crossover is that, once a distance is defined, we can formally define
a geometric crossover under that distance. Then what if the quotient space $G/\sim$ has a
distance $d_P$ induced by the distance $d_G$ of $G$? If so, the geometric crossover under $d_P$
would be a natural crossover, since it reflects the structure of the genotype space $G$ by
involving the distance $d_G$ of $G$.

By applying the formal definition of geometric crossover to the metric spaces $(G, d_G)$ and
$(P, d_P)$, we obtain the geometric crossovers $GX_G$ and $GX_P$, respectively. $GX_G$ searches the
space of genotypes and $GX_P$ searches that of phenotypes. Searching the space of phenotypes
has a number of advantages: (i) it is smaller than the space of genotypes, hence quicker to
search; (ii) the phenotypic distance is better tailored to the underlying problem, hence the
corresponding geometric crossover works better; (iii) the space of phenotypes has different
geometric characteristics from the genotypic space, which can be used to remove unwanted bias
from geometric crossover.

However, the crossover $GX_P$ cannot be directly implemented because it recombines phenotypes,
which are objects that cannot be directly represented. So, we propose the notion of quotient
geometric crossover to search the space of phenotypes with the crossover $GX_P$ indirectly by
manipulating the genotypes in $G$. The relationship among $G$, $P$, and their geometric crossovers
is illustrated in Figure 1.

In the next subsection we present more formally the concept of quotient metric space and 
quotient geometric crossover in relation with genotype-phenotype mapping. 
3.3 Quotient Metric Space

Let $(X, d)$ be a metric space and $(\mathcal{G}, \cdot) \subseteq \mathrm{Iso}(X)$ be a subgroup of
the isometry group, where $\cdot$ denotes the function composition operator. We introduce a
relation $\sim_{\mathcal{G}}$: $x$ and $y$ in $X$ are equivalent if and only if $x = g(y)$ for some
$g \in \mathcal{G}$. Then $\sim_{\mathcal{G}}$ is an equivalence relation by the following proposition.

Proposition 1. The relation $\sim_{\mathcal{G}}$ is an equivalence relation.
Proof. Assume that $x, y, z \in X$.

(i) Reflexivity: Since $\mathcal{G}$ is a group, the identity map $e$ is in $\mathcal{G}$. Since
$x = e(x)$, $x \sim_{\mathcal{G}} x$.

(ii) Symmetry: Suppose that $x \sim_{\mathcal{G}} y$, i.e., $x = g(y)$ for some $g \in \mathcal{G}$.
There exists $g^{-1} \in \mathcal{G}$ since $\mathcal{G}$ is a group. Then, $y = g^{-1}(x)$. So
$y \sim_{\mathcal{G}} x$.

(iii) Transitivity: Suppose that $x \sim_{\mathcal{G}} y$ and $y \sim_{\mathcal{G}} z$. Then
$x = g(y)$ and $y = h(z)$ for some $g, h \in \mathcal{G}$, so $x = g(y) = g(h(z)) = (g \cdot h)(z)$.
$g \cdot h$ is in $\mathcal{G}$ since $\mathcal{G}$ is a group. Hence, $x \sim_{\mathcal{G}} z$. □

For $x \in X$, the equivalence class $\bar{x}$ can be written as
$\bar{x} = \{g(x) : g \in \mathcal{G}\}$. Now we give a metric on $X/\sim_{\mathcal{G}}$ (usually
denoted simply by $X/\mathcal{G}$) induced by the original metric $d$ on $X$.

Definition 5 (Quotient metric). The quotient metric $\bar{d}(\bar{x}, \bar{y})$ is defined as
$\min\{d(x', y') : x' \in \bar{x},\ y' \in \bar{y}\}$.

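The following proposition shows that $\bar{d}$ is actually a metric on $X/\mathcal{G}$.
Proposition 2. $(X/\mathcal{G}, \bar{d})$ is a metric space, i.e., $\bar{d}$ is a metric on $X/\mathcal{G}$.

Proof. Assume that $\bar{x}, \bar{y}, \bar{z} \in X/\mathcal{G}$.

(i) Identity: $0 \le \bar{d}(\bar{x}, \bar{x}) \le d(x, x) = 0$.

(ii) Symmetry: There exist $x_1 \in \bar{x}$ and $y_1 \in \bar{y}$ such that
$\bar{d}(\bar{x}, \bar{y}) = d(x_1, y_1)$. Then,
$\bar{d}(\bar{x}, \bar{y}) = d(x_1, y_1) = d(y_1, x_1) \ge \bar{d}(\bar{y}, \bar{x})$. Similarly,
$\bar{d}(\bar{y}, \bar{x}) \ge \bar{d}(\bar{x}, \bar{y})$. Hence,
$\bar{d}(\bar{x}, \bar{y}) = \bar{d}(\bar{y}, \bar{x})$.

(iii) Triangular inequality: There exist $x_1 \in \bar{x}$ and $y_1 \in \bar{y}$ such that
$\bar{d}(\bar{x}, \bar{y}) = d(x_1, y_1)$. Also, there exist $y_2 \in \bar{y}$ and $z_2 \in \bar{z}$
such that $\bar{d}(\bar{y}, \bar{z}) = d(y_2, z_2)$. Since $y_1$ and $y_2$ belong to the same
equivalence class, there exists $g \in \mathcal{G}$ such that $y_1 = g(y_2)$. Then,

$$
\begin{aligned}
\bar{d}(\bar{x}, \bar{y}) + \bar{d}(\bar{y}, \bar{z}) &= d(x_1, y_1) + d(y_2, z_2) \\
&= d(x_1, g(y_2)) + d(y_2, z_2) \\
&= d(x_1, g(y_2)) + d(g(y_2), g(z_2)) && (\because g \in \mathrm{Iso}(X)) \\
&\ge d(x_1, g(z_2)) && (\because d \text{ is a metric on } X) \\
&\ge \bar{d}(\bar{x}, \bar{z}). && (\because z_2 \sim_{\mathcal{G}} g(z_2))
\end{aligned}
$$

□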

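The following proposition gives a simpler definition of the quotient metric $\bar{d}$.
Proposition 3. If we let $\tilde{d}(x, \bar{y}) := \min\{d(x, y') : y' \in \bar{y}\}$, then
$\bar{d}(\bar{x}, \bar{y}) = \tilde{d}(x, \bar{y})$.

Proof. Let $\bar{x}, \bar{y} \in X/\mathcal{G}$. It is clear that
$\tilde{d}(x, \bar{y}) \ge \bar{d}(\bar{x}, \bar{y})$ by the definition. Now we show that
$\tilde{d}(x, \bar{y}) \le \bar{d}(\bar{x}, \bar{y})$. Suppose that
$\bar{d}(\bar{x}, \bar{y}) = d(x_1, y_1)$ with $x_1 \in \bar{x}$ and $y_1 \in \bar{y}$. Since $x$ and
$x_1$ belong to the same equivalence class, there exists $g \in \mathcal{G}$ such that $x = g(x_1)$.
Since $g$ is an isometry, $d(x_1, y_1) = d(g(x_1), g(y_1)) = d(x, g(y_1))$. Moreover,
$g(y_1) \sim_{\mathcal{G}} y_1 \sim_{\mathcal{G}} y$. So
$\tilde{d}(x, \bar{y}) \le d(x, g(y_1)) = d(x_1, y_1) = \bar{d}(\bar{x}, \bar{y})$. □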
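The equality of the two-sided and one-sided minima can be checked numerically on a small space. The sketch below assumes, purely for illustration, the space $\{0, 1, 2\}^4$ under the Hamming distance with the group of all relabelings of the three symbols (labels start at 0 here rather than 1); the function names are hypothetical.

```python
from itertools import permutations, product

def hamming(a, b):
    """Number of positions at which a and b differ."""
    return sum(u != v for u, v in zip(a, b))

def orbit(a, k):
    """Equivalence class of a under all relabelings of the k symbols 0..k-1."""
    return {tuple(s[v] for v in a) for s in permutations(range(k))}

def d_two_sided(x, y, k):
    """Quotient metric as in Definition 5: minimum over both classes."""
    return min(hamming(xp, yp) for xp in orbit(x, k) for yp in orbit(y, k))

def d_one_sided(x, y, k):
    """Simpler form: x is kept fixed and only the class of y is scanned."""
    return min(hamming(x, yp) for yp in orbit(y, k))

k, n = 3, 4
space = list(product(range(k), repeat=n))
agree = all(d_two_sided(x, y, k) == d_one_sided(x, y, k)
            for x in space for y in space)
print(agree)
```

The one-sided form is the one that matters in practice, because it means only the second parent has to be transformed.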

This metric space $(X/\mathcal{G}, \bar{d})$ is called the quotient metric space. The quotient
space conceptually corresponds to the phenotype space. The line segment and the geometric
crossover in the quotient metric space are defined in the same way as in other metric spaces.
However, since in general solutions are represented only in the genotype space, we need to define
line segments and crossovers on $(X, d)$, not on $(X/\mathcal{G}, \bar{d})$, to apply the concept in
practice.

In a metric space $(X, d)$, a quotient line segment is a set of the form
$[\bar{x}; \bar{y}]_{\bar{d}} = \{z \in X \mid \bar{d}(\bar{x}, \bar{z}) + \bar{d}(\bar{z}, \bar{y}) = \bar{d}(\bar{x}, \bar{y}),\ \bar{z} \in X/\mathcal{G}\}$,
where $\bar{x}, \bar{y} \in X/\mathcal{G}$.

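Proposition 4. If $\bar{d}(\bar{x}, \bar{y}) = d(x, y^*)$ for some $y^* \in \bar{y}$, then
$[x; y^*]_d \subseteq [\bar{x}; \bar{y}]_{\bar{d}}$.

Proof. Let $z \in [x; y^*]_d$. Then,
$\bar{d}(\bar{x}, \bar{z}) + \bar{d}(\bar{z}, \bar{y}) \le d(x, z) + d(z, y^*) = d(x, y^*) = \bar{d}(\bar{x}, \bar{y})$.
Since $\bar{d}(\bar{x}, \bar{z}) + \bar{d}(\bar{z}, \bar{y}) \ge \bar{d}(\bar{x}, \bar{y})$ by the
triangular inequality, $\bar{d}(\bar{x}, \bar{z}) + \bar{d}(\bar{z}, \bar{y}) = \bar{d}(\bar{x}, \bar{y})$.
So, $z \in [\bar{x}; \bar{y}]_{\bar{d}}$. □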

Now we can define the quotient geometric crossover. 

Definition 6 (Quotient geometric crossover). A binary operator $GX_q$ is a quotient geometric
crossover under the metric $d$ and the equivalence relation $\sim_{\mathcal{G}}$ if all offspring
are in the quotient line segment between its parents $x$ and $y$, i.e.,
$GX_q(x, y) \subseteq [\bar{x}; \bar{y}]_{\bar{d}}$.

In the following we define the induced quotient crossover, which is a kind of quotient geometric
crossover. It is defined using the original geometric crossover under the original distance $d$.
Whereas the quotient geometric crossover is conceptual, the induced quotient crossover is concrete
and easily implemented.

Definition 7 (Induced quotient crossover). First, find $y^*$ in the equivalence class $\bar{y}$ of
the second parent $y$ such that $\bar{d}(\bar{x}, \bar{y}) = d(x, y^*)$. Then, do the geometric
crossover on $X$ using the first parent $x$ and the normalized second parent $y^*$.

Corollary 1. The induced quotient crossover is a quotient geometric crossover.

Proof. Let $y^* \in \bar{y}$ be a normalized second parent, i.e., $\bar{d}(\bar{x}, \bar{y}) = d(x, y^*)$.
Then, $[x; y^*]_d \subseteq [\bar{x}; \bar{y}]_{\bar{d}}$ by Proposition 4. This satisfies the
definition of quotient geometric crossover. □
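The two steps of the induced quotient crossover (normalize the second parent, then apply the original geometric crossover) can be sketched generically. Everything named below is an assumption for illustration: the function `induced_quotient_crossover`, the use of uniform crossover as the underlying geometric crossover, and the brute-force scan of the equivalence class, which is only feasible when the class is small. The parents are the ones used in Figure 3.

```python
import random
from itertools import permutations

def hamming(a, b):
    """Number of positions at which a and b differ."""
    return sum(u != v for u, v in zip(a, b))

def relabelings(y, symbols=(1, 2, 3)):
    """Equivalence class of y under all permutations of the symbols."""
    out = []
    for image in permutations(symbols):
        sigma = dict(zip(symbols, image))
        out.append(tuple(sigma[v] for v in y))
    return out

def uniform_crossover(x, y, rng):
    """A geometric crossover under the Hamming distance."""
    return tuple(rng.choice(pair) for pair in zip(x, y))

def induced_quotient_crossover(x, y, orbit, d, crossover, rng):
    # Step 1: normalization, pick y* in the class of y closest to x.
    y_star = min(orbit(y), key=lambda cand: d(x, cand))
    # Step 2: ordinary geometric crossover between x and y*.
    return crossover(x, y_star, rng), y_star

rng = random.Random(1)
x, y = (1, 2, 3, 1), (2, 1, 2, 3)
child, y_star = induced_quotient_crossover(
    x, y, relabelings, hamming, uniform_crossover, rng)
print(y_star, child)
```

Passing the orbit, the distance, and the crossover as arguments keeps the sketch independent of the representation, in the spirit of the geometric framework.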

The induced quotient crossover can be a bridge between the original geometric crossover and the
quotient geometric crossover. We can redraw Figure 1 to include the induced quotient crossover.
The result is shown in Figure 2.

In the next section, we consider a number of equivalence classes for the quotient operation 
and its related induced genotypic crossover transformation. 
Figure 2: Diagram linking genotype, phenotype spaces, and crossovers including induced
quotient crossover



4 Examples 

In this section, we introduce examples of quotient geometric crossovers. In some examples
(groupings in Subsection 4.1 and sequences in Subsection 4.5), previously proposed crossovers are
reinterpreted as quotient geometric crossovers. In other examples, we introduce new crossovers
consistent with the genotype-phenotype mapping of the given problems using the concept of
quotient geometric crossover.



4.1 Groupings 

Grouping problems [7] are commonly concerned with partitioning a given item set into mutually
disjoint subsets. Examples belonging to this class of problems are multiway graph partitioning,
graph coloring, bin packing, and so on. Grouping representation is also used to solve the
joint replenishment problem, a well-known problem in the field of industrial engineering [26].
In this class of problems, normalization decreased the problem difficulty and led to notable
improvement in performance.

Most normalization studies for grouping problems have focused on the $k$-way partitioning
problem. In this problem, the $k$-ary representation, in which $k$ subsets are represented by the
integers from $0$ to $k - 1$, has been generally used. In this case, one phenotype (a $k$-way
partition) is represented by $k!$ different genotypes. A normalization method was used for this
problem, and other studies on the $k$-way partitioning problem used the same technique [T2]. In
the sense that normalization pursues the minimization of genotype inconsistency among
chromosomes, Kim and Moon [13] proposed an optimal, efficient normalization method for grouping
problems and a distance measure, the labeling-independent distance, that eliminates such
dependency completely.

Now we reinterpret the previous work in terms of quotient space. Let
$a, b \in X = \{1, 2, \ldots, k\}^n$ be $k$-ary encodings (fixed-length vectors on a $k$-ary
alphabet) and let $\Sigma_k$ be the set of all permutations of length $k$. For each
$\sigma \in \Sigma_k$, we can view $\sigma$ as a function on $X$ by defining $\sigma(a)$ to be the
encoding obtained by relabeling $a$ with the permutation $\sigma$. For example, in the case that
$a = (1, 2, 3, 3, 2, 4, 1, 4)$ is a 4-ary encoding and
$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix} \in \Sigma_4$,
$\sigma(a) = (2, 4, 3, 3, 4, 1, 2, 1)$.



$X = \{1, 2, 3\}^4$, $x = (1, 2, 3, 1)$, $y = (2, 1, 2, 3) \in X$

$\sigma \in \Sigma_3$       $\sigma(y) \in \bar{y}$          $H(x, \sigma(y))$
$\sigma_1 = (1, 2, 3)$      $\sigma_1(y) = (2, 1, 2, 3)$     4
$\sigma_2 = (1, 3, 2)$      $\sigma_2(y) = (3, 1, 3, 2)$     3
$\sigma_3 = (2, 1, 3)$      $\sigma_3(y) = (1, 2, 1, 3)$     2
$\sigma_4 = (2, 3, 1)$      $\sigma_4(y) = (3, 2, 3, 1)$     1
$\sigma_5 = (3, 1, 2)$      $\sigma_5(y) = (1, 3, 1, 2)$     3
$\sigma_6 = (3, 2, 1)$      $\sigma_6(y) = (2, 3, 2, 1)$     3

$y^* = \sigma_4(y) = (3, 2, 3, 1)$

$\bar{d}(\bar{x}, \bar{y}) = 1$

Figure 3: An example of grouping

It is well known that permutations form a group. Hence, $\Sigma_k$ is a group. Moreover, when
we use the Hamming distance $H$ on $X$, it is easy to check that each $\sigma \in \Sigma_k$ is an
isometry. So $\sim_{\Sigma_k}$ becomes an equivalence relation and the quotient metric in
Definition 5 is well defined by Proposition 2. The quotient metric was introduced as the
labeling-independent distance in [13]. In the context of this study, it is rewritten as follows:

$\bar{d}(\bar{a}, \bar{b}) := \min_{\sigma \in \Sigma_k} H(a, \sigma(b))$.

An example case is shown in Figure 3.

The labeling-independent crossover presented in [13] is defined as follows.

Definition 8 (Labeling-independent crossover). Normalize the second parent to the first under 
the Hamming distance. Then, do the normal crossover using the first parent and the normalized 
second parent. 

This crossover is exactly the induced quotient crossover. For the normalization step, it is
possible to enumerate all $k!$ permutations and find an optimal one among them. However, for
large $k$, such a procedure is intractable. Fortunately, it can be done in $O(k^3)$ time using the
Hungarian method proposed by Kuhn [15].
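The reason an assignment algorithm applies can be seen from the following sketch, which is an illustration under our own assumptions rather than the implementation of [13]. It builds a $k \times k$ overlap matrix whose entry $(i, j)$ counts the positions where the first parent has label $i$ and the second parent has label $j$; minimizing $H(x, \sigma(y))$ is then the same as choosing one entry per row and column with maximum total. For brevity the sketch solves this assignment problem by enumeration; the Hungarian method would replace the `max` over permutations for large $k$. The helper names are hypothetical.

```python
from itertools import permutations

def overlap_matrix(x, y, k):
    """m[i][j] = number of positions where x has label i+1 and y has label j+1."""
    m = [[0] * k for _ in range(k)]
    for u, v in zip(x, y):
        m[u - 1][v - 1] += 1
    return m

def normalize(x, y, k):
    """Relabel y so that it agrees with x in as many positions as possible."""
    m = overlap_matrix(x, y, k)
    # s[j] is the new (0-based) label assigned to old label j of y.
    # Enumeration stands in for the Hungarian method in this sketch.
    best = max(permutations(range(k)),
               key=lambda s: sum(m[s[j]][j] for j in range(k)))
    return tuple(best[v - 1] + 1 for v in y)

x, y = (1, 2, 3, 1), (2, 1, 2, 3)
y_star = normalize(x, y, 3)
distance = sum(u != v for u, v in zip(x, y_star))
print(y_star, distance)
```

Only the overlap matrix depends on the encoding length $n$, so the cost of the assignment step depends on $k$ alone.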
In summary, we have the following. 
Example | Groupings
Genotype space $X$ | $\{1, 2, \ldots, k\}^n$ ($k$-ary encodings)
Isometry group $\mathcal{G}$ inducing phenotype space $X/\mathcal{G}$ | $\Sigma_k$: the set of all permutations of length $k$
Metric $d$ on $X$ | Hamming distance $H$
Original geometric crossover | normal crossover on $k$-ary encodings
Induced quotient crossover | labeling-independent crossover in [14]



Interpreting normalization for grouping problems as a quotient geometric crossover lets us
understand the benefit of normalization in terms of landscape analysis. We have already done
this in our previous work [17].

4.2 Graphs 

In this subsection, we consider any problem naturally defined over a graph in which the fitness of 
the solution does not depend on the labels on the nodes but only on the structural relationship, 
i.e., edge between nodes. 

Formally, let $A \in \mathfrak{M}_n$ be the adjacency matrix of a labeled graph on $n$ nodes
and let $\mathcal{P}_n$ be the set of all $n \times n$ permutation matrices$^2$. Then, for each
permutation matrix $P \in \mathcal{P}_n$, the matrix $PAP^T$ represents the labeled graph obtained
by relabeling $A$ according to the permutation represented by $P$. The fitness
$f: \mathfrak{M}_n \to \mathbb{R}$ satisfies $f(A) = f(PAP^T)$ for every $A \in \mathfrak{M}_n$ and
every permutation matrix $P$.

Let $(\mathfrak{M}_n, H)$ be a metric space on the labeled graphs under the Hamming distance $H$.
Notice that this metric is labeling-dependent. In particular, $H(A, PAP^T)$ may not be zero
although $A$ and $PAP^T$ represent the same structure. If $A$ is equal to $PBP^T$ for some
permutation matrix $P$, we define $A$ and $B$ to be in relation $\sim_{\mathcal{P}_n}$, i.e.,
$A \sim_{\mathcal{P}_n} B$. Since the set of permutation matrices $\mathcal{P}_n$ forms a group, the
relation $\sim_{\mathcal{P}_n}$ is an equivalence relation by Proposition 1.

The equivalence class $\bar{A}$ is represented as follows:

$\bar{A} := \{PAP^T : P \in \mathcal{P}_n\}$.

It corresponds to an unlabeled graph, and the quotient space $\mathfrak{M}_n/\mathcal{P}_n$ can be
understood as the unlabeled-graph space. $\mathfrak{M}_n/\mathcal{P}_n$ is a quotient metric space
by Proposition 2. So we obtain the induced quotient metric on $\mathfrak{M}_n/\mathcal{P}_n$. It
can be written as follows:

$\bar{d}(\bar{A}, \bar{B}) = \min_{P \in \mathcal{P}_n} H(A, PBP^T)$.

An example for graphs is shown in Figure 4.

Now we design the induced quotient crossover. In this example, the process of finding the
normalized second parent $y^*$ can be understood as graph matching when we think in terms of
graphs rather than adjacency matrices.

Definition 9 (Induced quotient crossover for graphs). Do the graph matching of the second
parent $B$ to the first parent $A$ under the Hamming distance $H$, i.e.,

$B^* := \arg\min_{B' \in \bar{B}} H(A, B')$.

Then, do the normal crossover using the first parent $A$ and the graph-matched second parent $B^*$.

2 A permutation matrix is a (0, 1)-matrix with exactly one 1 in every row and every column.

Figure 4: An example of graph
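The graph-matching step can be sketched by exhaustive search over node relabelings, which is only practical for very small $n$. The two three-node path graphs and the function names below are assumptions made for the example; a practical implementation would use a dedicated graph-matching heuristic in place of the enumeration.

```python
from itertools import permutations

def relabel(adj, perm):
    """Adjacency matrix after renaming node i to perm[i]."""
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[perm[i]][perm[j]] = adj[i][j]
    return out

def hamming(a, b):
    """Number of matrix entries at which a and b differ."""
    return sum(u != v for row_a, row_b in zip(a, b) for u, v in zip(row_a, row_b))

def graph_match(a, b):
    """Relabeling of b that is closest to a under the Hamming distance."""
    n = len(a)
    candidates = (relabel(b, p) for p in permutations(range(n)))
    return min(candidates, key=lambda c: hamming(a, c))

a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # path with node 1 in the middle
b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # same path with node 0 in the middle
b_star = graph_match(a, b)
print(hamming(a, b), hamming(a, b_star))
```

The two parents describe the same unlabeled graph, so the labeled distance is positive while the matched distance is zero, and a crossover applied after matching cannot disrupt the shared structure.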

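The induced quotient crossover is defined over unlabeled graphs $\mathfrak{M}_n/\mathcal{P}_n$.
This space is much smaller than the space of labeled graphs $\mathfrak{M}_n$. More precisely, since
each equivalence class contains at most $n!$ labeled graphs,
$|\mathfrak{M}_n/\mathcal{P}_n| \ge |\mathfrak{M}_n|/n!$, with near equality when most graphs have no
nontrivial automorphism. This means that the more labels there are, the smaller the
unlabeled-graph space is relative to the labeled-graph space. A smaller space tends to mean
better performance, given the same number of evaluations.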
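To get a feel for the size of the reduction, the following sketch counts labeled and unlabeled graphs for very small $n$ by brute force. It assumes simple undirected graphs without self-loops, which is our reading and may be narrower than the general adjacency matrices considered in the text; the function name is hypothetical.

```python
from itertools import combinations, permutations, product

def count_graphs(n):
    """Return (labeled, unlabeled) counts of simple undirected graphs on n nodes."""
    pairs = list(combinations(range(n), 2))
    canonical = set()
    labeled = 0
    for bits in product((0, 1), repeat=len(pairs)):
        labeled += 1
        edges = [p for p, b in zip(pairs, bits) if b]
        # Canonical form: smallest sorted edge list over all node relabelings.
        forms = (
            tuple(sorted(tuple(sorted((s[u], s[v]))) for u, v in edges))
            for s in permutations(range(n))
        )
        canonical.add(min(forms))
    return labeled, len(canonical)

for n in (3, 4):
    labeled, unlabeled = count_graphs(n)
    print(n, labeled, unlabeled)
```

For these small sizes the number of unlabeled graphs is somewhat larger than the labeled count divided by $n!$, because graphs with symmetries have fewer than $n!$ distinct labelings.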

We now describe how graph matching guides the implementation of specific geometric
crossovers. To implement the geometric crossover over unlabeled graphs, we need to use labeled
graphs. The labeling is necessary to represent and handle the solution, even though it is only
auxiliary and can be considered as not being part of the problem to solve. Graph matching before
crossover allows us to implement the geometric crossover on the unlabeled-graph space. After
graph matching, we use the corresponding geometric crossover over the auxiliary space of
labeled graphs.



Example | Graphs
Genotype space $X$ | $\mathfrak{M}_n$ (the set of all $n \times n$ adjacency matrices)
Isometry group $\mathcal{G}$ inducing phenotype space $X/\mathcal{G}$ | $\mathcal{P}_n$: the set of all $n \times n$ permutation matrices
Metric $d$ on $X$ | Hamming distance $H$
Original geometric crossover | traditional crossover on adjacency matrices seen as length-$n^2$ vectors
Induced quotient crossover | graph matching before traditional crossover (newly introduced in this study)



By applying the quotient geometric crossover on graphs, we can design a crossover better 
tailored to graphs. The notion of graph matching before crossover arises directly from the defi- 
nition of quotient geometric crossover. Graphs are very important because they are ubiquitous. 
In future work we will test this crossover on some applications. Graphs and groupings can be 
seen as particular cases of labeled structures in which the fitness of a solution depends only on 
the structure and not on the specific labeling. In future work we will also study the class of 
labeled structures in combination with quotient geometric crossover. 

4.3 Symmetric Functions 

A symmetric function on $n$ variables $x_1, x_2, \ldots, x_n$ is a function that is unchanged by
any permutation of its variables. That is, if
$f(x_1, x_2, \ldots, x_n) = f(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)})$ for any
permutation $\sigma$, the function $f$ is called a symmetric function. In this subsection, we
consider problems whose fitness function is symmetric. Some evolutionary studies have been made
on such problems [2S\ I32j. More properties of specific symmetric functions are introduced in
[21130].

Solutions for symmetric functions are typically represented as $n$-dimensional vectors, i.e.,
length-$n$ strings. Let $X$ be the solution space (or domain) of the given symmetric function and
$\Sigma_n$ be the set of all permutations of length $n$. Similarly to the example of grouping in
Section 4.1, $\sigma \in \Sigma_n$ can be understood as a function. For example, in the case that
$x = (x_1, x_2, x_3, x_4)$ and
$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix} \in \Sigma_4$,
$\sigma(x) = (x_2, x_4, x_3, x_1)$. As mentioned in Section 4.1, $\Sigma_n$ is a group and each
$\sigma$ is an isometry.

$X = \mathbb{R}^3$, $x = (1, 4, 5)$, $y = (3, 0, 6) \in X$

$\sigma \in \Sigma_3$       $\sigma(y) \in \bar{y}$       $E(x, \sigma(y))$
$\sigma_1 = (1, 2, 3)$      $\sigma_1(y) = (3, 0, 6)$     $\sqrt{21}$
$\sigma_2 = (1, 3, 2)$      $\sigma_2(y) = (3, 6, 0)$     $\sqrt{33}$
$\sigma_3 = (2, 1, 3)$      $\sigma_3(y) = (0, 3, 6)$     $\sqrt{3}$
$\sigma_4 = (2, 3, 1)$      $\sigma_4(y) = (0, 6, 3)$     $3$
$\sigma_5 = (3, 1, 2)$      $\sigma_5(y) = (6, 3, 0)$     $\sqrt{51}$
$\sigma_6 = (3, 2, 1)$      $\sigma_6(y) = (6, 0, 3)$     $3\sqrt{5}$

$y^* = \sigma_3(y) = (0, 3, 6)$

$\bar{d}(\bar{x}, \bar{y}) = \sqrt{3}$

Figure 5: An example of symmetric function under the Euclidean distance

If $X$ is a real space, we can use the Euclidean distance $E$. In that case, the induced quotient
metric on $X/\mathcal{G}$ is defined as follows:

$\bar{d}(\bar{x}, \bar{y}) := \min_{\sigma \in \Sigma_n} E(x, \sigma(y))$.

Figure 5 illustrates the quotient metric space for this case.

Induced quotient crossover can also be defined as in Definition [7J Because it uses permuta- 
tion, it can be performed in 0(n 3 ) time by the Hungarian method similarly to groupings. 
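
The normalization step can be illustrated on the data of Figure 5. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the search is a brute force over all $n!$ rearrangements (so it only suits tiny $n$), and the line-segment recombination stands in for "traditional crossover on real vectors" as an assumption.

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+


def normalize(x, y):
    """Return the rearrangement of y that is closest to x in Euclidean distance.

    Brute force over all permutations of y; an assignment solver would be
    needed to reach the O(n^3) bound mentioned in the text.
    """
    return min(permutations(y), key=lambda p: dist(x, p))


def quotient_distance(x, y):
    """Induced quotient metric: distance from x to the normalized y."""
    return dist(x, normalize(x, y))


def quotient_crossover(x, y, t=0.5):
    """Rearrange y first, then pick a point on the segment between x and y*.

    The blend parameter t in [0, 1] is an illustrative assumption.
    """
    y_star = normalize(x, y)
    return tuple((1 - t) * a + t * b for a, b in zip(x, y_star))


x, y = (1, 4, 5), (3, 0, 6)
y_star = normalize(x, y)          # the rearrangement y* of Figure 5
child = quotient_crossover(x, y)  # midpoint of x and y*
```

For larger $n$, the same minimization can be posed as a linear assignment problem with cost $(x_i - y_j)^2$ and handed to an assignment solver such as `scipy.optimize.linear_sum_assignment`.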

Summary for this Euclidean case is as follows: 

| Example | Symmetric functions on real space |
|---|---|
| Genotype space $X$ | $\mathbb{R}^n$ |
| Isometry group $Q$ inducing phenotype space $X/Q$ | $\Sigma_n$: the set of all permutations of length $n$ |
| Metric $d$ on $X$ | Euclidean distance $E$ |
| Original geometric crossover | traditional crossover on real vectors |
| Induced quotient crossover | rearranging before traditional crossover (newly proposed in this study) |



On the other hand, if $X$ is a discrete space, as in binary or $k$-ary encoding, we can use the 
Hamming distance $H$. Then, the induced quotient metric on $X/Q$ is defined as follows: 

$$d(x, y) := \min_{\sigma \in \Sigma_n} H(x, \sigma(y))$$ 

The induced quotient crossover for this case can also be performed in $O(n^3)$ time by the Hungarian 
method. In sum, we have: 



| Example | Symmetric functions on discrete space |
|---|---|
| Genotype space $X$ | $\{0,1\}^n$ |
| Isometry group $Q$ inducing phenotype space $X/Q$ | $\Sigma_n$: the set of all permutations of length $n$ |
| Metric $d$ on $X$ | Hamming distance $H$ |
| Original geometric crossover | traditional crossover on binary or $k$-ary vectors |
| Induced quotient crossover | rearranging before traditional crossover (newly proposed in this study) |
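
For the Hamming case, the quotient metric can be checked on small strings. The sketch below is an illustration under stated assumptions rather than the method of the text: instead of the Hungarian method, it uses a symbol-counting shortcut (the largest number of positions that can agree is the size of the multiset intersection of the two strings), and cross-checks it against brute force. All function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def hamming(a, b):
    """Number of positions where a and b differ."""
    return sum(u != v for u, v in zip(a, b))


def quotient_hamming_bruteforce(x, y):
    """min over all rearrangements of y of H(x, rearranged y); tiny n only."""
    return min(hamming(x, p) for p in permutations(y))


def quotient_hamming_counts(x, y):
    """Same value from symbol counts: n minus the multiset overlap."""
    overlap = Counter(x) & Counter(y)  # multiset intersection
    return len(x) - sum(overlap.values())


def normalize_hamming(x, y):
    """Build a rearrangement y* of y attaining the quotient distance."""
    pool = Counter(y)
    y_star = [None] * len(x)
    for i, symbol in enumerate(x):     # first pass: match symbols of x
        if pool[symbol] > 0:
            y_star[i] = symbol
            pool[symbol] -= 1
    leftovers = list(pool.elements())  # second pass: fill unmatched slots
    for i in range(len(x)):
        if y_star[i] is None:
            y_star[i] = leftovers.pop()
    return tuple(y_star)


xb, yb = (1, 0, 1, 1, 0), (0, 0, 1, 0, 0)  # binary example
xk, yk = (0, 1, 2, 2), (2, 0, 0, 1)        # k-ary example with k = 3
```

For binary strings the quotient distance reduces to the absolute difference in the number of ones, which matches the fact that a symmetric function on $\{0,1\}^n$ depends only on that count.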



4.4 Circular Permutations 

Here we consider the case that solutions of a problem are represented as circular permutations, 
as in the traveling salesman problem (TSP). Gluing the head and tail of a permutation yields a 
circular permutation. Circular permutations cannot be represented directly; they are typically 
represented by simple permutations, so each circular permutation is represented by more 
than one permutation. For example, the permutations (1,2,3), (2,3,1), and (3,1,2) represent the 
same phenotype, i.e., the same circular permutation. In such problems, the genotype space is a set of 
permutations and the phenotype space is a set of circular permutations. We can consider this 
problem in view of genotype-phenotype mapping using the concept of quotient space. 

Let $\Sigma_n$ be the set of all permutations of length $n$. A function $s_k : \Sigma_n \to \Sigma_n$ is defined by 
the $k$-step circular shift operation to the right. For example, $s_2(1,2,3) = (2,3,1)$. The set of all shift 
operations $S_n = \{s_k : k = 0, 1, 2, \ldots, n-1\}$ is a group, and it is easy to check that each $s_k$ is 
an isometry on $\Sigma_n$. If $\Sigma_n$ has a metric, $\Sigma_n/S_n$ has an induced quotient metric by Proposition 2. 

Now we consider various distances for permutation encoding. The most typical distance is 
the Hamming distance $H$. Under the Hamming distance, it is known that cycle crossover is 
geometric [19]. In this case, the quotient metric is defined as follows: 

$$d(x, y) := \min_{s \in S_n} H(x, s(y)).$$ 

An example case is shown in Figure 6. 

Then we can define the induced quotient crossover. 

Definition 10 (Position-independent cycle crossover). Normalize the second parent to the first 
under the Hamming distance $H$. Then, do the cycle crossover using the first parent and the 
normalized second parent. 

Normalizing the second parent takes $O(n)$ time because the equivalence class of the second 
parent has exactly $n$ elements, one per shift operation. 
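
Definition 10 can be sketched in code as follows. This is an illustrative sketch under assumptions, not the authors' implementation: the function names are hypothetical, and the linear-time normalization by counting the shift each element would need is one possible way to meet the $O(n)$ bound (comparing all $n$ shifts position by position would cost $O(n^2)$). The cycle crossover shown picks each cycle at random from one of the two parents.

```python
import random
from collections import Counter


def shift_right(p, k):
    """k-step circular shift to the right: shift_right((1, 2, 3), 2) == (2, 3, 1)."""
    k %= len(p)
    return p[-k:] + p[:-k] if k else p


def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))


def normalize_by_shift(x, y):
    """Return the shift of y that agrees with x in the most positions."""
    n = len(x)
    pos_in_x = {v: i for i, v in enumerate(x)}
    # y[j] lands on its position in x exactly when the shift is (pos_in_x - j) mod n,
    # so the most frequent required shift minimizes the Hamming distance.
    votes = Counter((pos_in_x[v] - j) % n for j, v in enumerate(y))
    best_k = max(range(n), key=lambda k: votes[k])
    return shift_right(y, best_k)


def cycle_crossover(x, y, rng=random):
    """Cycle crossover: every cycle of positions is copied whole from one parent."""
    n = len(x)
    pos_in_x = {v: i for i, v in enumerate(x)}
    child = [None] * n
    for start in range(n):
        if child[start] is not None:
            continue
        source = x if rng.random() < 0.5 else y
        j = start
        while child[j] is None:        # walk the cycle containing `start`
            child[j] = source[j]
            j = pos_in_x[y[j]]
    return tuple(child)


def position_independent_cx(x, y, rng=random):
    """Normalize the second parent to the first, then apply cycle crossover."""
    return cycle_crossover(x, normalize_by_shift(x, y), rng)


x = (2, 4, 5, 1, 6, 3)
y = (4, 6, 1, 5, 3, 2)
y_star = normalize_by_shift(x, y)  # the shift s_1(y) of the Figure 6 example
```

Because each cycle is copied whole from a single parent, every offspring position holds the value of one of the two aligned parents and the offspring remains a permutation.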

$x = (2,4,5,1,6,3)$, $y = (4,6,1,5,3,2) \in X$ 

| $s \in S_6$ | $s(y)$ | $H(x, s(y))$ |
|---|---|---|
| $s_0$ | $s_0(y) = (4,6,1,5,3,2)$ | $6$ |
| $s_1$ | $s_1(y) = (2,4,6,1,5,3)$ | $2$ |
| $s_2$ | $s_2(y) = (3,2,4,6,1,5)$ | $6$ |
| $s_3$ | $s_3(y) = (5,3,2,4,6,1)$ | $5$ |
| $s_4$ | $s_4(y) = (1,5,3,2,4,6)$ | $6$ |
| $s_5$ | $s_5(y) = (6,1,5,3,2,4)$ | $5$ |

$y^* = s_1(y) = (2,4,6,1,5,3)$, $d(x,y) = 2$ 

Figure 6: An example of circular permutation under the Hamming distance 

Cycle crossover is also geometric under the swap distance. The induced quotient crossover 
can be defined in a similar way using the swap distance instead of the Hamming distance. The 
summary for the case of applying the cycle crossover is as follows: 

| Example | Circular permutations |
|---|---|
| Genotype space $X$ | $\Sigma_n$ |
| Isometry group $Q$ inducing phenotype space $X/Q$ | $S_n$: the set of all shift operations |
| Metric $d$ on $X$ | Hamming distance $H$ (or swap distance) |
| Original geometric crossover | cycle crossover |
| Induced quotient crossover | rearranging before cycle crossover (newly proposed in this study) |


On the other hand, we can use another well-known distance for $\Sigma_n$: the reversal distance. Its 
neighborhood structure is the one based on the 2-opt move. The reversal move selects any two 
points along the permutation and then reverses the subsequence between these points. This move 
induces a graphic distance between circular permutations: the minimum number of reversals 
needed to transform one circular permutation into the other. The geometric crossover associated with 
this distance belongs to the family of sorting crossovers [19]: it picks offspring on a minimum 
sorting trajectory between the parent circular permutations sorted by reversals. 

Definition 11 (Position-independent sorting-by-reversals crossover). Normalize the second parent 
to the first under the graphic distance. Then, do the crossover based on sorting by reversals 
for permutations using the first parent and the normalized second parent. 
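
The normalization step of Definition 11 can be sketched for very small permutations. The code below is an illustration only, under stated assumptions: it computes the unsigned reversal distance between linear permutations by exhaustive breadth-first search, which is exponential in $n$, and the function names are hypothetical. It does not implement the sorting-by-reversals crossover itself.

```python
from collections import deque


def reversals(p):
    """Yield every permutation obtained from p by reversing one segment of length >= 2."""
    n = len(p)
    for i in range(n - 1):
        for j in range(i + 2, n + 1):
            yield p[:i] + p[i:j][::-1] + p[j:]


def reversal_distance(x, y):
    """Minimum number of reversals turning x into y (breadth-first search, tiny n only)."""
    seen = {x: 0}
    queue = deque([x])
    while queue:
        p = queue.popleft()
        if p == y:
            return seen[p]
        for q in reversals(p):
            if q not in seen:
                seen[q] = seen[p] + 1
                queue.append(q)
    raise ValueError("x and y are not permutations of the same elements")


def shift_right(p, k):
    """k-step circular shift to the right."""
    k %= len(p)
    return p[-k:] + p[:-k] if k else p


def normalize_by_shift(x, y):
    """Return the shift of y that is closest to x in reversal distance."""
    return min((shift_right(y, k) for k in range(len(y))),
               key=lambda s: reversal_distance(x, s))


x = (1, 2, 3, 4, 5)
y = (2, 3, 4, 5, 1)                # same circular permutation as x
y_star = normalize_by_shift(x, y)
```

Here `y` is several reversals away from `x` as a linear permutation, but one of its shifts coincides with `x`, so the quotient distance between the two is zero.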
| Example | Circular permutations |
|---|---|
| Genotype space $X$ | $\Sigma_n$ |
| Isometry group $Q$ inducing phenotype space $X/Q$ | $S_n$: the set of all shift operations |
| Metric $d$ on $X$ | reversal distance |
| Original geometric crossover | sorting-by-reversals crossover for permutations [19] |
| Induced quotient crossover | sorting-by-reversals crossover for circular permutations [19] |



There is a problem in implementing the geometric crossover under the reversal distance. Sort- 
ing linear or circular permutations by reversals is NP-hard [TJ [29] . So, the geometric crossover 
under the reversal distance cannot be implemented efficiently. Nevertheless, this example of 
quotient geometric crossover illustrates how to obtain a geometric crossover for a transformed 
representation (circular permutation) starting from a geometric crossover for the original 
representation (permutation). So in this case quotient geometric crossover is used as a tool to build a 
new crossover for a derivative representation from a known geometric crossover for the original 
representation. From [19], we know that the sorting-by-reversals crossover for permutations is 
an excellent crossover for TSP. In future work we want to test the sorting-by-reversals crossover 
for circular permutations. Since circular permutations are a direct representation, we expect it 
to perform even better. 

4.5 Sequences 

The application in this subsection does not fit exactly into the quotient framework based on an 
isometry subgroup, unlike the applications introduced earlier. However, we present it because 
it follows the quotient approach except that the equivalence relation does not come from an 
isometry subgroup. 

We recast alignment before recombination in variable-length sequences as a consequence of 
quotient geometric crossover. Consider the case that we use stretched sequences as genotypes 
of sequences. Stretched sequences are sequences created by interleaving '-' anywhere and in 
any number in the sequences. We can define a relation ~ on stretched sequences: each stretched 
sequence belongs to the class of its unstretched version. Then, we can easily check that the 
relation ~ is an equivalence relation. 

In [23], Moraglio et al. applied geometric crossover to variable-length sequences. The 
distance they used for variable-length sequences is the edit distance LD³: the minimum 
number of insertions, deletions, and replacements of single characters needed to transform one sequence into 
the other. The geometric crossover associated with this distance, proposed in [23], is called 
homologous geometric crossover: two sequences are aligned optimally before recombination. 
Alignment here means allowing parent sequences to be stretched to match better with each 
other. Two parent stretched sequences are aligned by interleaving or removing '-' to create two 
stretched sequences of the same length that have minimal Hamming distance. For example, if we 
want to recombine agcacaca and acacacta, we need to align them optimally first: agcacac-a 
and a-cacacta. Notice that the Hamming distance between the aligned sequences is less than 
the Hamming distance between the non-aligned sequences. After the optimal alignment, one 
does the normal crossover and produces a new stretched sequence. The offspring is obtained by 
removing '-', i.e., by unstretching the sequence. 

3 The notation LD comes from Levenshtein distance, which is another name for edit distance. 
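
The align, recombine, and unstretch steps described above can be sketched as follows. This is a minimal illustration under assumptions, not the implementation of [23]: the alignment uses the standard unit-cost dynamic program for edit distance, "normal crossover" is taken to be uniform crossover, and the function names and the `rng` parameter are hypothetical.

```python
import random

GAP = "-"


def align(a, b):
    """Return (stretched a, stretched b, edit distance) for an optimal alignment."""
    n, m = len(a), len(b)
    # dist[i][j] = edit distance between the prefixes a[:i] and b[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(dist[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                             dist[i - 1][j] + 1,
                             dist[i][j - 1] + 1)
    # Trace one optimal path back from the corner, emitting aligned columns.
    sa, sb = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            sa.append(a[i - 1]); sb.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            sa.append(a[i - 1]); sb.append(GAP); i -= 1
        else:
            sa.append(GAP); sb.append(b[j - 1]); j -= 1
    return "".join(reversed(sa)), "".join(reversed(sb)), dist[n][m]


def homologous_crossover(a, b, rng=random):
    """Align the parents, apply uniform crossover column by column, then unstretch."""
    sa, sb, _ = align(a, b)
    stretched_child = [u if rng.random() < 0.5 else v for u, v in zip(sa, sb)]
    return "".join(c for c in stretched_child if c != GAP)


sa, sb, d = align("agcacaca", "acacacta")
child = homologous_crossover("agcacaca", "acacacta", random.Random(0))
```

Several optimal alignments may exist, so the stretched parents returned here need not coincide character for character with the pair shown in the text, but their Hamming distance equals the edit distance of the unstretched parents.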

From [23], we can easily check that the edit distance for sequences is a metric and hence that the 
phenotype space (the space of variable-length sequences) is a quotient metric space. In fact, the 
edit distance corresponds to a quotient metric and the homologous geometric crossover 
corresponds to the induced quotient crossover. Suppose that we deal only with genotypes, i.e., stretched 
sequences, and leave the offspring produced by homologous crossover stretched, without removing 
'-'. Then the offspring lies exactly on the quotient line segment. So the crossover is a quotient 
geometric crossover in terms of stretched sequences. In sum, we have: 



| Example | Sequences |
|---|---|
| Genotype space $X$ | stretched sequences |
| Equivalence relation ~ inducing phenotype space $X/{\sim}$ | stretched sequences with the same unstretched sequence |
| Metric $d$ on $X$ | edit distance |
| Original geometric crossover | traditional crossover on stretched sequences |
| Induced quotient crossover | homologous crossover [23] |



Phenotypes are variable-length sequences, which are directly representable. So in this case the 
quotient geometric crossover is not used to search a non-directly representable space 
(phenotypes) through an auxiliary directly representable space (genotypes). The benefit of applying the 
quotient geometric crossover to variable-length sequences is that the homologous crossover $GX_P$ over 
sequences is naturally understood as a transformation of the geometric crossover $GX_G$ over 
stretched sequences $G$ rather than as a crossover acting directly on sequences $P$. This is because 
the notion of optimal alignment is inherently defined on stretched sequences and not on simple 
sequences. In [23], Moraglio et al. tested the homologous crossover on the protein motif 
discovery problem. In future work we want to study how the optimal alignment transformation 
affects the fitness landscape associated with geometric crossover with and without alignment. 

5 Concluding Remarks 

In this paper we have mathematically analyzed genotype and phenotype spaces by introducing 
the notion of quotient space. Phenotype space can be regarded as the quotient space induced by a 
genotype-phenotype mapping. Geometric crossovers have the advantage that they can be formally 
defined once a distance is defined. Owing to this advantage we can connect a solution space, viewed as a 
metric space, with crossovers. Moreover, a geometric crossover based on an appropriate distance 
for a space reflects the properties of that space. We introduced a quotient metric on phenotype space. 
Since the quotient metric is part of the phenotype space structure, the geometric crossover under 
this metric reflects the properties of phenotype space more effectively than the original geometric 
crossover. 

As shown in the application examples, quotient geometric crossover is not only theoretically 
significant but also has the practical effect of making search more effective by reducing the search 
space or removing the inherent bias. In the example of grouping, we reinterpreted the 
geometric crossover [T7] previously proposed by the authors so that it is theoretically complete. 
In the examples of graphs, symmetric functions, and circular permutations, we induced new 
crossovers better tailored to phenotype space using the proposed methodology. In the example 
of sequences, we analyzed a previous study [23] in view of our quotient theory, although 
it falls slightly outside the framework we presented. 

In future work, we will test the proposed induced quotient crossovers in solving the problems 
using genetic algorithms. Also, more examples and applications for each example case are left 
for future study. 